CV4Edu - Computer Vision for Education
Computer vision (CV) plays a central role in multimodal human-centered AI, yet most models are trained on web-scale benchmarks that poorly reflect real classrooms. Educational data are noisy, private, small-scale, and multimodal (e.g., video, audio, text). Students’ cognitive/behavioral states (e.g., engagement, mind-wandering) and learning processes (e.g., self-regulation, collaboration) can be inferred from subtle multimodal cues (e.g., gaze, pose, facial features). Still, today’s models struggle to generalize to classroom data, which limits their reliability in deployed human-centered applications (e.g., assistive technology, collaborative AI). CV4Edu brings together researchers in computer vision, natural language processing, human-computer interaction, and education to chart a community agenda for efficient, privacy-aware, data-driven multimodal models that are more reliable in low-resource, real-world classroom settings, potentially launching shared datasets, metrics, and unified practices.
Our goal is to support research that bridges CV, NLP, HCI, cognitive science, and the learning sciences/education communities. We welcome submissions both within and beyond education contexts—such as multimodal modeling, sensing, behavior forecasting, cognitive state inference, robotics, and embodied AI—provided they discuss transferability to classroom settings (e.g., what may break or carry over under noise, occlusions, viewpoints, multi-person dynamics, privacy constraints, limited annotations, distribution shift, hardware variability).
Topics
The workshop topics include (but are not limited to):
Multimodal classroom perception
- Face, gaze, pose, gesture, posture, affect, and prosody
- Video, audio, gaze sensors, and wearables (egocentric and exocentric)
- Multimodal fusion, representation learning, and cross-view / multi-camera setups
Language-centered multimodal learning analytics
- Linking speech/text to video events, gaze/attention, and instructional context
- Classroom NLP: ASR robustness, diarization, evaluating and mitigating bias, discourse modeling, dialogue/tutoring interactions, simplification, misconception detection
- Retrieval-augmented classroom analytics, model adaptation, evaluation for learning-aligned outcomes
Robustness & generalization
- Domain shift beyond the lab, occlusions, noisy data, and missing modalities
- Few-/low-shot learning, continual and on-device adaptation
- Generalization across classroom layouts and populations
Human behavior modeling for learning
- Engagement, attention, affect, confusion, self-regulation, and metacognition
- Collaboration, group dynamics, and teacher–student interactions
- Gaze-informed models, saliency/scanpath prediction, activity recognition
Temporal modeling & intervention
- Sequential/temporal models of learning processes
- Behavioral forecasting, early-warning systems, and interventions
- Real-time inference, feedback, and human-in-the-loop systems
Interpretability, reliability & evaluation
- Interpretable models, uncertainty estimation, and calibration
- OOD detection, fairness, and bias analysis
- Evaluation protocols aligned with learning outcomes
Privacy-aware AI, datasets & deployments
- Privacy-preserving data collection, anonymization, de-identification, and governance
- Annotation strategies, construct-aligned labeling, active learning, synthetic data, and dataset curation
- Classroom-ready systems, scalable multimodal data-collection frameworks, edge/on-device inference, and real-world deployments
We encourage general computer-vision, visually grounded NLP, and human-centered, collaborative AI submissions (e.g., behavioral modeling, pose/activity recognition, gaze estimation, attention modeling, multimodal learning, methods “in the wild”, cognitive state inference and forecasting) that make a clear connection to educational/learning environments (even if primarily in the discussion).
Accepted Papers
Archival Papers
Non-Archival Papers
Workshop Schedule - June 4 - Room 113
- Opening and Goals
- Keynotes 1 and 2
- Poster Session @ Hall A
- Coffee Break
- Poster Session @ Hall A (cont.)
- Keynotes 3 and 4
- Panel and Community Discussion
- Closing and Next Steps
Venue
700 14th Street
Denver, CO 80202
The workshop will be held in conjunction with CVPR 2026.
Workshop Organizers
For any questions about the workshop, please contact cv4edu.cvpr@gmail.com
Ekta Sood
Joyce Horn Fonteles
Mariah Bradford
Paul Gavrikov
Prajit Dhar
Janis Pagel
Trisha Mital
Gautam Biswas
Sidney D'Mello
